Whole-Genome Variance Components Linkage Analysis Using Single-Nucleotide Polymorphisms Versus Microsatellites on Quantitative Traits of Derived Phenotypes from Factor Analysis of Electroencephalogram Waves

Authors

  • Yi Yu
  • Yan Meng
  • Qianli Ma
  • John Farrell
  • Lindsay A Farrer
  • Marsha A Wilcox
  • Yi Yu
Abstract

Alcohol dependence is a serious public health problem. We studied data from families participating in the Collaborative Study on the Genetics of Alcoholism (COGA), made available to participants in the Genetic Analysis Workshop 14 (GAW14), in order to search for genes predisposing to alcohol dependence. Using factor analysis, we identified four factors (F1, F2, F3, F4) related to the electroencephalogram traits. We conducted variance components linkage analysis with each of the factors. Our results using the Affymetrix single-nucleotide polymorphism dataset showed significant evidence for a novel linkage of F3 (a factor comprising the three midline channel EEG measures from the target case of the Visual Oddball experiment: ttdt2, ttdt3, and ttdt4) to chromosome 18 (LOD = 3.45). This finding was confirmed by analyses of the microsatellite data (LOD = 2.73) and Illumina SNP data (LOD = 3.30). We also demonstrated that, in a sample like the COGA data, a dense single-nucleotide polymorphism map provides better linkage signals than a low-resolution microsatellite map with quantitative traits.

Background

Alcoholism is a complex disorder involving multiple genes that likely interact with one another and with environmental factors. Quantitative endophenotypes, such as electroencephalogram (EEG) measurements, have been suggested as better indices of alcoholism susceptibility than the customary dichotomous affection status [1,2]. EEG data defined by different experimental designs were available to participants in Genetic Analysis Workshop 14 (GAW14). Since EEG phenotypes are correlated, it is likely that a smaller number of underlying dimensions contribute to the variance of these EEG phenotypes. Our aim was to identify the underlying factors for the EEG phenotypes and to search for genes that influence the derived factors and increase the risk of alcohol dependence.
From Genetic Analysis Workshop 14: Microsatellite and single-nucleotide polymorphism (Proceedings). Noordwijkerhout, The Netherlands, 7-10 September 2004. Joan E Bailey-Wilson, Laura Almasy, Mariza de Andrade, Julia Bailey, Heike Bickeböller, Heather J Cordell, E Warwick Daw, Lynn Goldin, Ellen L Goode, Courtney Gray-McGuire, Wayne Hening, Gail Jarvik, Brion S Maher, Nancy Mendell, Andrew D Paterson, John Rice, Glen Satten, Brian Suarez, Veronica Vieland, Marsha Wilcox, Heping Zhang, Andreas Ziegler and Jean W MacCluer. Published: 30 December 2005. BMC Genetics 2005, 6(Suppl 1):S15. doi:10.1186/1471-2156-6-S1-S15

Methods

Phenotypes and factor analysis

We conducted a principal components analysis using the 12 EEG measures from the Visual Oddball experiment. Each measure is named with four letters followed by a number (ttth1-ttth4, ttdt1-ttdt4, and ntth1-ntth4). The four letters denote the experimental condition. The ttth_ fields contain measures extracted from the target case in the 'late' time window, set at 300 to 700 ms following stimulus presentation (bounding the visual P3 event), for theta band power (3 to 7 Hz). The ttdt_ fields contain measures for delta band power (1 to 2.5 Hz), with all other conditions the same as for ttth_. The ntth_ fields contain measures extracted from the nontarget case in the 'early' time window, set at 100 to 300 ms following stimulus presentation, for theta band power (3 to 7 Hz). The number following the four letters denotes which of the 4 electrode placements was used: 1 = FP1 (far frontal left side channel), 2 = FZ (frontal midline channel), 3 = CZ (central midline channel), 4 = PZ (parietal midline channel).
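The naming convention above can be restated as a small lookup, which makes it easy to check which condition and electrode a given measure refers to. This is only an illustrative sketch: the dictionary and function names are hypothetical and are not part of the COGA or GAW14 data distribution.

```python
# Decode Visual Oddball EEG field names such as "ttdt3" into their parts.
# The lookup tables restate the naming convention described in the text;
# all names below (CONDITIONS, ELECTRODES, decode_field) are hypothetical.

CONDITIONS = {
    "ttth": {"case": "target", "window_ms": (300, 700),
             "band": "theta", "band_hz": (3.0, 7.0)},
    "ttdt": {"case": "target", "window_ms": (300, 700),
             "band": "delta", "band_hz": (1.0, 2.5)},
    "ntth": {"case": "nontarget", "window_ms": (100, 300),
             "band": "theta", "band_hz": (3.0, 7.0)},
}

ELECTRODES = {1: "FP1", 2: "FZ", 3: "CZ", 4: "PZ"}


def decode_field(name):
    """Return the condition and electrode described by a field name."""
    prefix, digit = name[:4], name[4:]
    if prefix not in CONDITIONS or not digit.isdigit():
        raise ValueError(f"unrecognized EEG field name: {name!r}")
    electrode = ELECTRODES.get(int(digit))
    if electrode is None:
        raise ValueError(f"unrecognized electrode number in: {name!r}")
    info = dict(CONDITIONS[prefix])  # copy so callers cannot mutate the table
    info["electrode"] = electrode
    return info


# The 12 measures entered into the principal components analysis.
ALL_FIELDS = [f"{prefix}{number}" for prefix in CONDITIONS for number in ELECTRODES]

if __name__ == "__main__":
    for field in ("ttdt2", "ttdt3", "ttdt4"):  # the measures loading on F3
        print(field, decode_field(field))
```

For example, `decode_field("ttdt3")` describes a target-case, late-window, delta-band measure recorded at CZ.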
This was followed by a common factor analysis in order to identify the underlying dimensions measured by the EEG data. We examined each of the phenotypes for normality before including it in the analysis. In the common factor model, each new phenotype is expressed as a linear combination of the original variables. The relationship of factors to the EEG phenotypes is reflected by factor loadings. The contribution of each factor to the set of variables is evaluated by eigenvalues. Based upon the distribution of the eigenvalues and the composition of the factors, we retained four factors. This solution accounted for 88% of the total variance. We used an oblique rotation of the factor solution. Factor scores were obtained using PROC FACTOR implemented in SAS (SAS version 8; SAS, Cary, NC). We treated each of the four factor scores as a new derived quantitative trait.

Map construction

Quantitative data usually provide more statistical power than a binary affection status. However, quantitative traits alone may still not be powerful enough to identify disease susceptibility genes for complex traits. Kruglyak predicted that single-nucleotide polymorphisms (SNPs) with a heterozygosity of 0.50, at approximately two to three times the density of the current microsatellite marker sets, would achieve results in linkage analysis similar to those of a genome scan with microsatellite markers [3]. Recently, John et al. conducted a whole-genome scan using SNPs [4]. Their results showed that SNPs provided significantly higher information content than microsatellites and allowed loci to be defined more precisely. We hypothesized that SNPs would also provide higher information content and better linkage signals than microsatellites for quantitative traits. We carried out a whole-genome screen using 143 families from the Collaborative Study on the Genetics of Alcoholism (COGA) with four empirically derived quantitative traits (factor scores based upon the EEG data).
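To see why a single SNP carries less information than a microsatellite, and hence why a denser SNP map is needed, it helps to compute expected heterozygosity, H = 1 - Σ p_i², where p_i are the allele frequencies. The sketch below is illustrative only: the function name is hypothetical, and the eight equally frequent microsatellite alleles are an assumed example, not a value from the COGA marker sets.

```python
# Expected heterozygosity of a marker from its allele frequencies.
# Illustrative sketch; the microsatellite allele frequencies are assumed.

def heterozygosity(allele_freqs):
    """Return H = 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    if any(p < 0 for p in allele_freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# A biallelic SNP reaches its maximum heterozygosity when both alleles
# have frequency 0.5, which is the 0.50 figure cited from Kruglyak.
snp_h = heterozygosity([0.5, 0.5])

# Hypothetical microsatellite with 8 equally frequent alleles.
microsatellite_h = heterozygosity([1 / 8] * 8)

if __name__ == "__main__":
    print(f"SNP heterozygosity:            {snp_h:.3f}")
    print(f"Microsatellite heterozygosity: {microsatellite_h:.3f}")
```

Under these assumptions the SNP is capped at 0.5, while the multiallelic marker is well above it, which is consistent with the argument that SNP maps must be denser to match a microsatellite scan.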
Reformatted clean genotype data were provided by the COGA study, including 11,120 SNPs generated by the Affymetrix GeneChip Mapping 10 K Array, 4,720 SNPs generated by Illumina, and 328 microsatellite markers spaced at 10-cM intervals across the genome. Genetic map positions for both microsatellites and SNPs were interpolated from the deCODE genetic framework map on the basis of their physical positions. Physical positions of SNPs were obtained from the NCBI database (release 34.3). SNPs with multiple physical map positions were dropped from the genetic map. All initial linkage analysis was performed using this adjusted map.

Linkage disequilibrium (LD)

Because linkage analysis algorithms assume linkage equilibrium between all markers, strong LD between SNPs may exaggerate the significance level of linkage and thus generate false positive results [5]. We therefore kept only one tag SNP in each haplotype block (a set of SNPs in strong LD). The pairwise LD statistics D' and r² were calculated for all SNPs with HAPLOVIEW (v3.0) [6]. Haplotype blocks were defined as regions over which a very small proportion (<5%) of comparisons among informative SNP pairs showed strong evidence of historical recombination [7].

Linkage analysis

We performed variance components analysis for each factor using SOLAR (v2.13) [8]. In variance components analysis, the total variance of each trait was decomposed into several sources by the following equation:

Ω = Πσq + 2Φσg + Iσe,

where Ω is the covariance matrix for a pedigree; Π is a matrix with elements πqij, the expected proportion of genes that individuals i and j share identical by descent (IBD) at a specific chromosomal location; Φ is the kinship matrix; I is the identity matrix; σq is the variance component corresponding to the additive genetic effects from the major locus; σg is the variance component corresponding to the polygenic effects; and σe is the variance component corresponding to the environmental effects.
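The covariance decomposition above can be made concrete for a single nuclear family. The following is a minimal sketch, not the SOLAR implementation: the function name, the IBD-sharing values in `pi`, and the three variance components are all assumed for illustration. Only the kinship coefficients follow from the pedigree structure (0.5 for self, 0.25 for parent-offspring and full sibs, 0 for unrelated parents).

```python
# Build the pedigree covariance matrix  Omega = Pi*sq + 2*Phi*sg + I*se
# for a nuclear family. Illustrative sketch with assumed inputs.

def pedigree_covariance(pi, phi, var_q, var_g, var_e):
    """Return Omega as a list of lists, computed element by element."""
    n = len(pi)
    omega = []
    for i in range(n):
        row = []
        for j in range(n):
            value = pi[i][j] * var_q + 2.0 * phi[i][j] * var_g
            if i == j:
                value += var_e  # environmental variance is on the diagonal only
            row.append(value)
        omega.append(row)
    return omega


# Order of individuals: father, mother, sib 1, sib 2.
# Kinship matrix Phi, determined by the pedigree structure.
PHI = [
    [0.50, 0.00, 0.25, 0.25],
    [0.00, 0.50, 0.25, 0.25],
    [0.25, 0.25, 0.50, 0.25],
    [0.25, 0.25, 0.25, 0.50],
]

# IBD sharing at one map position (assumed: the two sibs share both alleles).
PI = [
    [1.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 1.0],
    [0.5, 0.5, 1.0, 1.0],
]

# Assumed variance components for a trait standardized to unit variance.
VAR_Q, VAR_G, VAR_E = 0.2, 0.3, 0.5

OMEGA = pedigree_covariance(PI, PHI, VAR_Q, VAR_G, VAR_E)

if __name__ == "__main__":
    for row in OMEGA:
        print("  ".join(f"{value:.2f}" for value in row))
```

With these inputs the diagonal equals the total trait variance (σq + σg + σe), and the sib-sib covariance rises with IBD sharing at the locus, which is the signal that the linkage test exploits.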
The variance components analysis tested the null hypothesis that the additive genetic variance caused by the major quantitative trait locus (QTL) for a given trait equals zero (H0: σq = 0, or no linkage). The hypothesis was tested with a likelihood ratio test comparing the maximum likelihood of a restricted model, in which σq was constrained to zero, with that of a more general model in which σq was estimated. Twice the difference between the natural log-likelihoods of the two models yields a test statistic that is asymptotically distributed as a 50:50 mixture of a χ² distribution with 1 degree of freedom and a point mass at zero. The log10 of the likelihood ratio between the two models yields a LOD score that is equivalent to the classical LOD score of linkage analysis [8]. The IBD matrix, multipoint IBD matrix, and heritability (h²) for each factor were estimated using SOLAR.

Results

EEG measures and loadings on each of the four factors (F1, F2, F3, F4) obtained from factor analysis are shown in Table 1. Two alcoholism classifications were provided in the COGA data. ALDX1 was based on the DSM-III-R and the Feighner criteria. ALDX2 was defined by the DSM-IV criteria. Table 2 shows the results of an analysis of variance (ANOVA) comparing the factor scores for affection status groups defined by ALDX1 and ALDX2. F3 (the three midline channel EEG measures from the target case of the Visual Oddball experiment: ttdt2, ttdt3, and ttdt4) was significant for both ALDX1 and ALDX2, indicating that F3 scores differ among subjects with different affection status for alcohol dependence. Post-hoc comparisons using the Bonferroni method showed that F3 was significantly higher in the group that was unaffected with some symptoms than in the affected group (p < 0.05). Similar patterns were seen in ttdt3 and ttdt4. We examined the heritability of each of the quantitative traits.
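As an aid to reading the LOD scores that follow, the relation between the likelihood ratio statistic, the LOD score, and a pointwise p-value can be sketched as below. This is a hedged illustration and not SOLAR's code: the function names are hypothetical, and the p-value assumes the 50:50 mixture of a point mass at zero and a χ² distribution with 1 degree of freedom described in Methods.

```python
# Convert between log-likelihoods, LOD scores, and pointwise p-values
# for a variance components linkage test. Illustrative sketch only.
import math


def lod_from_loglik(loglik_null, loglik_alt):
    """LOD = log10 of the likelihood ratio, from natural log-likelihoods."""
    return (loglik_alt - loglik_null) / math.log(10)


def lrt_from_lod(lod):
    """Likelihood ratio statistic: twice the natural log-likelihood difference."""
    return 2.0 * math.log(10) * lod


def mixture_p_value(lrt):
    """P-value under a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if lrt <= 0:
        return 1.0
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))


if __name__ == "__main__":
    # Peak LOD scores for F3 on chromosome 18 as reported in this paper.
    reported = {"Affymetrix SNPs": 3.45, "Illumina SNPs": 3.30, "Microsatellites": 2.73}
    for dataset, lod in reported.items():
        lrt = lrt_from_lod(lod)
        print(f"{dataset}: LOD = {lod:.2f}, LRT = {lrt:.2f}, "
              f"pointwise p = {mixture_p_value(lrt):.2e}")
```

These are pointwise values; they do not adjust for the number of positions or traits scanned across the genome.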
Heritability estimates for F1 (34.5 ± 6.6), F2 (32.1 ± 5.9), F3 (30.7 ± 6.2), and F4 (30.8 ± 6.7) were all significant (p < 0.001). We found significant evidence of linkage for F3 to chromosome 18 (LOD = 3.45 at 58 cM) in the Affymetrix SNP dataset. We had similar findings in the microsatellite dataset (LOD = 2.73 at 61 cM) and the Illumina SNP dataset (LOD = 3.30 at 56 cM) (Figure 1). Linkage peaks (LOD > 1.0) for each of the four factors are presented in Table 3. All genome scan results for each factor in each genotype dataset are shown in Figure 2.

Discussion

Our work suggests that four factors underlie the EEG measures. Among the four factors, factor 3 (F3), representing the midline measures (EEG ttdt2, 3, 4), differed significantly between affection status groups as defined by both ALDX1 and ALDX2. We found a novel genetic locus with significant evidence of linkage to F3 (EEG ttdt2, 3, 4) on chromosome 18,

Figure 1: Multipoint LOD scores on chromosome 18 for trait F3, using the Affymetrix SNP (red), Illumina SNP (green), and microsatellite (blue) datasets.

Table 2: Relationship between factors and affected status

Factors   p-value (ALDX1)   p-value (ALDX2)
F1        0.1603            0.0502
F2        0.1515            0.0271
F3        0.0018            0.0431
F4        0.4099            0.4800

Table 1: Factor loadings pattern (oblique rotation)

EEG phenotypes   Factor loadings*

Similar resources

Single Nucleotide Polymorphisms and Association Studies: A Few Critical Points

Uncovering DNA sequence variations that correlate with phenotypic changes, e.g., diseases, is the aim of sequence variation studies. A common type of sequence variation is the single nucleotide polymorphism (SNP, pronounced snip). SNPs are the third-generation molecular marker. A SNP represents a DNA sequence variant of a single base pair with the minor allele occurring in more than 1% of a given popula...

The Pattern of Linkage Disequilibrium in Livestock Genome

Linkage disequilibrium (LD) is the basis of genomic selection, genomic marker imputation, marker assisted selection (MAS), quantitative trait loci (QTL) mapping, parentage testing and whole genome association studies. Particular alleles at closely located loci have a tendency to be co-inherited. In linked loci this pattern leads to association between alleles in a population, which is known as LD. Two metr...

DNA Polymorphisms at Candidate Gene Loci and Their Relation with Milk Production Traits in Murrah Buffalo (Bubalus bubalis)

DNA polymorphisms within the diacylglycerol transferase 2 (DGAT2) / monoacyl glycerol transferase 2 (MOGAT2), leptin and butyrophilin genes were analysed using PCR-SSCP in Murrah buffalo. The single strand conformation polymorphism (SSCP) analysis of amplified gene fragments in exon 5 of MOGAT2, exon 3 of leptin and intron 1 of the butyrophilin gene revealed different patterns. A, B and C showed the fol...

Association of IGF-I Gene Polymorphisms with Carcass Traits in Iranian Mehraban Sheep Using SSCP Analysis

Molecular genetic selection on individual genes is a promising method to genetically improve economically important traits in livestock. The insulin-like growth factor-I (IGF-I) gene may play important roles in the growth of multiple tissues, including muscle cells, cartilage and bone. The objectives of the present study were to estimate the haplotype frequencies of the IGF-I gene polymorphisms i...

Genome-wide single-nucleotide polymorphism linkage analyses of quantitative rheumatoid arthritis phenotypes in Caucasian NARAC families

We applied nonparametric quantitative trait linkage analysis to two rheumatoid arthritis quantitative phenotypes, IgM rheumatoid factor (RF) and anti-cyclic citrullinated peptide autoantibody titer measurements, using 5700 genome-wide Illumina single-nucleotide polymorphism genotypes on 658 Caucasian North American Rheumatoid Arthritis Consortium families. Peak LOD scores for both quantitative ...

Publication date: 2005